Skip to content

feat(sdk): ingredient JUMBF archives, archive metadata typing#2007

Open
gpeacock wants to merge 1 commit intomainfrom
gpeacock/archive_ingredients
Open

feat(sdk): ingredient JUMBF archives, archive metadata typing#2007
gpeacock wants to merge 1 commit intomainfrom
gpeacock/archive_ingredients

Conversation

@gpeacock
Copy link
Copy Markdown
Member

@gpeacock gpeacock commented Apr 2, 2026

  • Add Builder write_ingredient_archive(ingredient_id) and add_ingredient_from_archive
  • Tag JUMBF archives with org.contentauth.archive.metadata (archive:type builder|ingredient)
  • Add labels ARCHIVE_METADATA, ARCHIVE_TYPE_BUILDER, ARCHIVE_TYPE_INGREDIENT
  • Add INGREDIENT_ARCHIVE_MIME; to_archive uses builder archive type
  • Internal ArchiveKind + working_store_sign(kind); scoped one-ingredient claim via scoped_for_ingredient_archive
  • Preserve parent claim thumbnail and copy assertions (exclude archive metadata & box hash); merge archive resources on import
  • Reader::active_archive_type for ingredient-archive checks
  • Test round-trip with claim thumbnail and resource merge

- Add  Builder write_ingredient_archive(ingredient_id) and add_ingredient_from_archive
- Tag JUMBF archives with org.contentauth.archive.metadata (archive:type builder|ingredient)
- Add labels ARCHIVE_METADATA, ARCHIVE_TYPE_BUILDER, ARCHIVE_TYPE_INGREDIENT
- Add INGREDIENT_ARCHIVE_MIME; to_archive uses builder archive type
- Internal ArchiveKind + working_store_sign(kind); scoped one-ingredient claim via scoped_for_ingredient_archive
- Preserve parent claim thumbnail and copy assertions (exclude archive metadata & box hash); merge archive resources on import
- Reader::active_archive_type for ingredient-archive checks
- Test round-trip with claim thumbnail and resource merge
@codecov
Copy link
Copy Markdown

codecov bot commented Apr 2, 2026

Codecov Report

❌ Patch coverage is 81.19658% with 22 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.46%. Comparing base (e0e4cc2) to head (4bc536b).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
sdk/src/builder.rs 79.43% 22 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff            @@
##             main    #2007    +/-   ##
========================================
  Coverage   77.45%   77.46%            
========================================
  Files         176      176            
  Lines       44142    44256   +114     
========================================
+ Hits        34192    34284    +92     
- Misses       9950     9972    +22     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@codspeed-hq
Copy link
Copy Markdown

codspeed-hq bot commented Apr 2, 2026

Merging this PR will not alter performance

✅ 32 untouched benchmarks
⏩ 2 skipped benchmarks1


Comparing gpeacock/archive_ingredients (4bc536b) with main (122110d)

Open in CodSpeed

Footnotes

  1. 2 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@gpeacock gpeacock requested review from ok-nick, scouten-adobe and tmathern and removed request for tmathern April 2, 2026 18:43
Comment on lines +3313 to +3316
let json = format!(
r#"{{"@context":{{"archive":"https://contentauth.org/ns/archive#"}},"archive:type":"{archive_type}"}}"#
);
let archive_metadata = Metadata::new(labels::ARCHIVE_METADATA, &json)?;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this link supposed to point to?

Comment on lines +3348 to +3351
.filter(|a| {
let (base, _, _) = parse_label(a.label());
base != labels::ARCHIVE_METADATA && base != labels::BOX_HASH
})
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How come we filter for archive metadata and only box hash? If it's an archive metadata should we error since it shouldn't be there?

Comment on lines +1257 to +1266
pub(crate) fn active_archive_type(&self) -> Option<String> {
let manifest = self.active_manifest()?;
let metadata: Metadata = manifest
.find_assertion(crate::assertions::labels::ARCHIVE_METADATA)
.ok()?;
metadata
.value
.get("archive:type")
.and_then(|v: &Value| v.as_str().map(str::to_owned))
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Return enum here?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

///
/// For other `application/c2pa` stores, use [`Self::add_ingredient_from_reader`] or
/// [`Self::add_ingredient_from_stream`].
#[async_generic]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, should we have async variants of this function? Is there any async behavior we can expect?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The progress report, maybe, in future, could become async?

/// distinguished from a full builder archive when reading with [`Self::add_ingredient_from_archive`].
///
/// The exported manifest is **not** a lossless slice of the parent: it uses one cloned ingredient
/// and a fresh claim instance id; other ingredients are omitted.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about ingredients the ingredient contains?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"and a fresh claim instance id"?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shallow is the right contract. The compound content model in specs-core #2058 treats componentOf ingredients as references to independently signed child assets. A child's provenance chain lives in the child's manifest store, which standard ingredient handling copies into the parent. Sub-ingredients are not nested on the componentOf definition itself, they resolve from the manifest store at validation time.

write_ingredient_archive needs to carry the ingredient definition and its directly associated resources. Recursive sub-ingredient walking would conflate the ingredient reference with the manifest store history.

That said, scoped_for_ingredient_archive clones the entire resource store (self.resources.clone() at line 3357), which is broader than needed. @tmathern's suggestion to scope resources to the target ingredient and claim-level refs (thumbnail) is a tighter contract and avoids carrying unrelated sibling data.

On INGREDIENT_ARCHIVE_MIME as format (line 3356): format is removed in claim v2. The org.contentauth.archive.metadata assertion already carries archive:type, so the format field is redundant. Gating the assignment on claim_version or dropping it entirely avoids a forward-compat issue.

.collect();
scoped.definition.ingredients = vec![ingredient.clone()];
scoped.resources = self.resources.clone();
scoped.definition.format = INGREDIENT_ARCHIVE_MIME.to_string();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we have a custom format here? It's removed in claim v2.

pub const ARCHIVE_METADATA: &str = "org.contentauth.archive.metadata";

/// `archive:type` value for a full manifest [`Builder`](crate::Builder) working-store archive (JUMBF).
pub const ARCHIVE_TYPE_BUILDER: &str = "builder";
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not "workingStore"?

/// distinguished from a full builder archive when reading with [`Self::add_ingredient_from_archive`].
///
/// The exported manifest is **not** a lossless slice of the parent: it uses one cloned ingredient
/// and a fresh claim instance id; other ingredients are omitted.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"and a fresh claim instance id"?

///
/// For other `application/c2pa` stores, use [`Self::add_ingredient_from_reader`] or
/// [`Self::add_ingredient_from_stream`].
#[async_generic]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The progress report, maybe, in future, could become async?

};

let archive_type = kind.archive_type_str();
let json = format!(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't serde handle that formating/json?

Comment on lines +1257 to +1266
pub(crate) fn active_archive_type(&self) -> Option<String> {
let manifest = self.active_manifest()?;
let metadata: Metadata = manifest
.find_assertion(crate::assertions::labels::ARCHIVE_METADATA)
.ok()?;
metadata
.value
.get("archive:type")
.and_then(|v: &Value| v.as_str().map(str::to_owned))
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

.cloned()
.collect();
scoped.definition.ingredients = vec![ingredient.clone()];
scoped.resources = self.resources.clone();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just clone the resources linked to that ingredient?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants